A Maximum Profit Coverage Algorithm with Application to Small Molecules Cluster Identification

Authors

  • Refael Hassin
  • Einat Or
Abstract

In this article we model and analyze the cluster identification of molecules (CIM), a clustering problem in a finite metric space. CIM has the following characteristics, which separate it from other clustering models:

1. In most models outliers are a small portion of the data set, whereas in CIM they may be the vast majority of the objects (see Figure 1).
2. The clusters identified by CIM are compact and their diameter is bounded.
3. There is a lower bound on the number of objects in a cluster.
4. Clusters may be very close to one another, as a result of the bound on the diameter. What may be considered one cluster in other clustering models is considered several clusters in CIM (see Figure 2).
5. The number of clusters is not known a priori.

In this paper we present CIM and model it as a maximum profit coverage problem (MPCP). The model is a measure to be optimized, rather than a heuristic.

Consider a finite set $S$ in a metric space $M$ with a distance function $d$. A ball with center $t$ and radius $r$ is the subset $B(t, r) = \{x \in M \mid d(t, x) \le r\}$. We say that the ball covers the points of $S$ that it contains. Given a set of balls $B$ of radius $r$, a coverage $P = \{S'_1, \ldots, S'_l\}$ is a set of clusters such that each of them consists of points covered by a single ball of $B$. Let $S'_P = \bigcup_{i=1}^{l} S'_i$, and define the profit of $P$ as $\sum$ ...
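The ball and coverage definitions in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes Euclidean distance in the plane as the metric $d$, uses the data points themselves as candidate ball centers, and the names `ball_cover`, `candidate_clusters`, and `min_size` are hypothetical. It only lists balls that cover enough points to qualify as clusters; it does not compute the profit of a coverage, because that definition is cut off in the abstract.

```python
from math import dist  # Euclidean distance; an assumed stand-in for the metric d


def ball_cover(center, radius, points):
    """Return the points that lie in the closed ball B(center, radius)."""
    return [p for p in points if dist(center, p) <= radius]


def candidate_clusters(centers, radius, points, min_size):
    """Map each center to the points its ball covers, keeping only balls
    that cover at least `min_size` points (the lower bound on cluster size)."""
    clusters = {}
    for t in centers:
        covered = ball_cover(t, radius, points)
        if len(covered) >= min_size:
            clusters[t] = covered
    return clusters


# Toy data: three nearby points and two isolated outliers.
S = [(0.0, 0.0), (0.5, 0.0), (0.0, 0.5), (5.0, 5.0), (9.0, 1.0)]
clusters = candidate_clusters(centers=S, radius=1.0, points=S, min_size=3)
```

In this toy data set only the balls centered at the three nearby points cover enough points to qualify; the two isolated points remain outliers, which matches the abstract's remark that outliers need not be a small fraction of the data.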


Similar resources

Application of A Route Expansion Algorithm for Transit Routes Design in Grid Networks

Establishing a network of transit routes with satisfactory demand coverage is one of the main goals of transit agencies in moving towards a sustainable urban development. A primary concern in obtaining such a network is reducing operational costs. This paper deals with the problem of minimizing construction costs in a grid transportation network while satisfying a certain level o...


Multi-layer Clustering Topology Design in Densely Deployed Wireless Sensor Network using Evolutionary Algorithms

Due to resource constraints and dynamic parameters, reducing energy consumption has become one of the most important issues in wireless sensor network (WSN) topology design. All proposed hierarchical methods cluster a WSN into different cluster layers in a single step of an evolutionary algorithm with complicated parameters, which may reduce efficiency and performance. In fact, in WSNs topology, increasin...


Test Power Reduction by Simultaneous Don’t Care Filling and Ordering of Test Patterns Considering Pattern Dependency

Estimating and minimizing the maximum power dissipation during testing is an important task in VLSI circuit realization, since the power value affects the reliability of the circuits. Therefore, a methodology should be adopted to minimize power consumption during testing. Test patterns generated with the –D 1 option of ATALANTA contain don't-care bits (x bits). By suitable filling of don’t cares can...


The Ground-Set-Cost Budgeted Maximum Coverage Problem

We study the following natural variant of the budgeted maximum coverage problem: We are given a budget B and a hypergraph G = (V,E), where each vertex has a non-negative cost and a non-negative profit. The goal is to select a set of hyperedges T ⊆ E such that the total cost of the vertices covered by T is at most B and the total profit of all covered vertices is maximized. Besides being a natur...


Dynamic Coverage and Clustering: A Maximum Entropy Approach

We present a computational framework we have recently developed for solving a large class of dynamic coverage and clustering problems, ranging from those that arise in the deployment of mobile sensor networks to the identification of ensemble spike trains in computational neuroscience applications. This framework provides for the identification of natural clusters in an underlying dataset, whil...




Publication date: 2006